Review for NeurIPS paper: Self-Supervised Visual Representation Learning from Hierarchical Grouping

Neural Information Processing Systems

Summary and Contributions: [Post-rebuttal update begins] I thank the authors for addressing some of my concerns. However, I disagree with several of the arguments put forward in the rebuttal. I have nevertheless updated my overall score, as the authors provided or promised some of the requested experiments. I detail my concerns below. On the claim that the "aim of Self-supervised learning is to create universal visual representations": a substantial part of the community is still interested in transferable representations. This pursuit is valuable in its own right, as it tries to generalize to very novel settings with very limited data.


Self-Supervised Visual Representation Learning from Hierarchical Grouping

Neural Information Processing Systems

We create a framework for bootstrapping visual representation learning from a primitive visual grouping capability. We operationalize grouping via a contour detector that partitions an image into regions, followed by merging of those regions into a tree hierarchy. Across a large unlabeled dataset, we apply this learned primitive to automatically predict hierarchical region structure. These predictions serve as guidance for self-supervised contrastive feature learning: we task a deep network with producing per-pixel embeddings whose pairwise distances respect the region hierarchy. Experiments demonstrate that our approach can serve as state-of-the-art generic pre-training, benefiting downstream tasks.
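
The idea of embeddings whose pairwise distances respect a region hierarchy can be illustrated with a small sketch. It is not the authors' implementation: it substitutes a classic margin-based pairwise contrastive loss, and the names `ultrametric_distance` and `hierarchy_contrastive_loss`, the parent-pointer tree encoding, and the `threshold` and `margin` parameters are all assumptions made for illustration. The tree distance between two regions is taken to be the merge height of their lowest common ancestor; pixel pairs below `threshold` are pulled together and the rest are pushed at least `margin` apart.

```python
import numpy as np


def ancestors(node, parent):
    """Return the path from a node up to the root, inclusive."""
    path = [node]
    while parent[node] != -1:
        node = parent[node]
        path.append(node)
    return path


def ultrametric_distance(a, b, parent, height):
    """Merge height of the lowest common ancestor of regions a and b."""
    seen = set(ancestors(a, parent))
    for node in ancestors(b, parent):
        if node in seen:
            return height[node]
    raise ValueError("regions do not share a tree")


def hierarchy_contrastive_loss(emb, region, parent, height,
                               threshold=0.5, margin=1.0):
    """Pairwise contrastive loss guided by a region tree (illustrative only).

    emb:    (N, D) array of per-pixel embeddings
    region: length-N sequence giving the leaf-region id of each pixel
    """
    n = emb.shape[0]
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            d = float(np.linalg.norm(emb[i] - emb[j]))
            h = ultrametric_distance(region[i], region[j], parent, height)
            if h <= threshold:
                total += d ** 2                      # close in the tree: pull together
            else:
                total += max(0.0, margin - d) ** 2   # far in the tree: push apart
            count += 1
    return total / count


# Hypothetical toy tree: leaves 0, 1, 2; node 3 merges (0, 1) at height 0.2;
# node 4 (the root) merges (3, 2) at height 0.9.
parent = [3, 3, 4, 4, -1]
height = [0.0, 0.0, 0.0, 0.2, 0.9]
region = [0, 1, 2, 2]  # four pixels and their leaf regions

good = np.array([[0.0, 0.0], [0.0, 0.0], [2.0, 0.0], [2.0, 0.0]])
collapsed = np.zeros((4, 2))

print(hierarchy_contrastive_loss(good, region, parent, height))
print(hierarchy_contrastive_loss(collapsed, region, parent, height))
```

In this toy setup, the embedding that separates region 2 from regions 0 and 1 incurs a lower loss than the collapsed embedding, which is the behaviour the abstract describes. The double loop over pixel pairs is quadratic and only suitable for illustration; a practical version would sample pairs.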